## X Index Title Artist TopGenre Year
## 1 1 1794 Sacrifice Anouk dutch indie 1998
## 2 2 774 Hou Vol Hou Vast BLØF dutch pop 2018
## 3 3 625 Three Days In A Row Anouk dutch indie 2015
## 4 4 412 Peter Gunn Theme Emerson, Lake & Palmer album rock 2010
## 5 5 606 Het Dorp - Live Wim Sonneveld dutch cabaret 2015
## 6 6 754 Malle Babbe Rob De Nijs dutch pop 2018
## Beats.Per.Minute..BPM. Energy Danceability Loudness..dB. Liveness Valence
## 1 136 17 54 -13 9 24
## 2 86 61 51 -5 8 23
## 3 171 50 36 -6 16 39
## 4 131 83 43 -7 92 71
## 5 114 44 37 -15 67 45
## 6 87 38 35 -10 12 53
## Length..Duration. Acousticness Speechiness Popularity Abbrev
## 1 238 74 5 11 alternative
## 2 295 0 2 12 pop
## 3 254 0 3 13 alternative
## 4 217 1 3 14 rock
## 5 198 82 8 15 other
## 6 253 73 4 15 pop
Exploratory Data Analysis

## BPM Energy Danceability dB Liveness
## BPM 1.0000000000 0.15664444 -0.14060233 0.09292650 0.01625639
## Energy 0.1566444353 1.00000000 0.13961627 0.73571088 0.17411770
## Danceability -0.1406023300 0.13961627 1.00000000 0.04423531 -0.10306258
## dB 0.0929265007 0.73571088 0.04423531 1.00000000 0.09825705
## Liveness 0.0162563857 0.17411770 -0.10306258 0.09825705 1.00000000
## Valence 0.0596532230 0.40517478 0.51456376 0.14704112 0.05066664
## Duration 0.0062516715 0.02280040 -0.13543160 -0.05612653 0.03249854
## Acousticness -0.1224718133 -0.66515636 -0.13576888 -0.45163499 -0.04620551
## Speechiness 0.0855982110 0.20586499 0.12522900 0.12508975 0.09259447
## popularity_cat -0.0005116562 0.07732221 0.10725015 0.12941612 -0.08073505
## genre_cat_Num -0.0093962893 0.01656957 0.07082700 0.07949270 0.03985214
## Valence Duration Acousticness Speechiness popularity_cat
## BPM 0.05965322 0.006251671 -0.122471813 0.08559821 -0.0005116562
## Energy 0.40517478 0.022800396 -0.665156355 0.20586499 0.0773222082
## Danceability 0.51456376 -0.135431600 -0.135768879 0.12522900 0.1072501519
## dB 0.14704112 -0.056126527 -0.451634993 0.12508975 0.1294161168
## Liveness 0.05066664 0.032498536 -0.046205511 0.09259447 -0.0807350510
## Valence 1.00000000 -0.203689536 -0.239729075 0.10710188 0.0987226169
## Duration -0.20368954 1.000000000 -0.102318918 -0.02782584 -0.0619802162
## Acousticness -0.23972907 -0.102318918 1.000000000 -0.09825610 -0.0573931780
## Speechiness 0.10710188 -0.027825837 -0.098256101 1.00000000 0.0829758481
## popularity_cat 0.09872262 -0.061980216 -0.057393178 0.08297585 1.0000000000
## genre_cat_Num -0.06341676 -0.072497611 -0.006356929 0.10494913 -0.0567017949
## genre_cat_Num
## BPM -0.009396289
## Energy 0.016569567
## Danceability 0.070827001
## dB 0.079492696
## Liveness 0.039852139
## Valence -0.063416760
## Duration -0.072497611
## Acousticness -0.006356929
## Speechiness 0.104949133
## popularity_cat -0.056701795
## genre_cat_Num 1.000000000
## BPM Energy Danceability dB Liveness
## 1.074835 4.102349 1.511429 2.432290 1.072964
## Valence Duration Acousticness Speechiness genre_cat_Num
## 1.851642 1.110300 1.870871 1.083343 1.047772
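
The ten unlabeled values above look like variance inflation factors (VIFs) for the predictors; that reading is an assumption, since the report does not name them. As a minimal sketch of how such values can be obtained in base R, the VIFs of a set of numeric predictors equal the diagonal of the inverse of their correlation matrix. The helper `vif_from_cor` and the simulated `toy` data are hypothetical and not part of the original analysis.

```r
# Hypothetical helper: VIFs as the diagonal of the inverse correlation matrix.
# X is a data frame or matrix of numeric predictors (response excluded).
vif_from_cor <- function(X) {
  R <- cor(X)
  setNames(diag(solve(R)), colnames(X))
}

# Simulated predictors (assumption, for illustration only):
# `b` is built from `a`, so both should show inflated variance; `c` should not.
set.seed(1)
n <- 500
toy <- data.frame(a = rnorm(n))
toy$b <- toy$a + rnorm(n)
toy$c <- rnorm(n)

toy_vif <- vif_from_cor(toy)

# Cross-check against the textbook definition 1 / (1 - R^2) for one predictor.
r2_b <- summary(lm(b ~ a + c, data = toy))$r.squared
vif_b_direct <- 1 / (1 - r2_b)
```

A common rule of thumb treats values above about 5 as a sign of problematic collinearity; by that rule the largest value reported above (Energy, about 4.1) would be borderline rather than alarming.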
Part 1: Full Dataset Models
Linear Discriminant Analysis (LDA) on Full Dataset
## Time difference of 0.008840084 secs
##
## lda.class 0 1
## 0 567 398
## 1 424 605
## [1] 0.5877633
## [1] 0.4122367
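
The two numbers printed after the confusion matrix appear to be the accuracy and the error rate. A minimal sketch of how these, plus sensitivity and specificity, follow from the LDA counts above is given below. It assumes rows are predicted classes and columns are true classes, with class 1 treated as the positive class; the helper name `classification_metrics` is hypothetical.

```r
# LDA confusion matrix from the output above.
# Assumption: rows = predicted class, columns = true class.
cm <- matrix(c(567, 424, 398, 605), nrow = 2,
             dimnames = list(predicted = c("0", "1"), truth = c("0", "1")))

# Hypothetical helper: summary metrics with class "1" as the positive class.
classification_metrics <- function(cm) {
  accuracy    <- sum(diag(cm)) / sum(cm)       # correct predictions / all cases
  sensitivity <- cm["1", "1"] / sum(cm[, "1"]) # true positives / actual positives
  specificity <- cm["0", "0"] / sum(cm[, "0"]) # true negatives / actual negatives
  c(accuracy = accuracy, error = 1 - accuracy,
    sensitivity = sensitivity, specificity = specificity)
}

lda_metrics <- classification_metrics(cm)
round(lda_metrics, 3)
```

The same helper can be applied to any of the 2x2 tables in this report, provided the row and column labels are first set to "0" and "1".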
Quadratic Discriminant Analysis (QDA) on Full Dataset
## Time difference of 0.006765842 secs
##
## qda.class 0 1
## 0 498 320
## 1 493 683
## [1] 0.5922768
## [1] 0.4077232
Logistic Regression on Full Dataset
## Time difference of 0.005219936 secs
##
## glm.pred 0 1
## 0 567 398
## 1 424 605
## [1] 0.5877633
## [1] 0.4122367
K-Nearest Neighbors (KNN) on Full Dataset
## Time difference of 0.04465508 secs
##
## knn.pred 0 1
## 0 687 352
## 1 304 651
## [1] 0.671013
## [1] 0.328987
## Time difference of 0.03402019 secs
##
## knn.pred3 0 1
## 0 763 244
## 1 228 759
## [1] 0.7632899
## [1] 0.2367101
## Time difference of 0.0342679 secs
##
## knn.pred5 0 1
## 0 718 309
## 1 273 694
## [1] 0.7081244
## [1] 0.2918756
## Time difference of 0.04007506 secs
##
## knn.pred10 0 1
## 0 662 355
## 1 329 648
## [1] 0.6569709
## [1] 0.3430291
Part 2: Training and Test Split Models
Linear Discriminant Analysis (LDA) with Train/Test Split
## Time difference of 0.00903821 secs
##
## lda.class2 0 1
## 0 266 207
## 1 227 297
## [1] 0.5646941
## [1] 0.4353059
Quadratic Discriminant Analysis (QDA) with Train/Test Split
## Time difference of 0.005118847 secs
##
## qda.class2 0 1
## 0 197 151
## 1 296 353
## [1] 0.551655
## [1] 0.448345
Logistic Regression with Train/Test Split
## Time difference of 0.00689292 secs
##
## glm.pred2 0 1
## 0 269 207
## 1 224 297
## [1] 0.5677031
## [1] 0.4322969
K-Nearest Neighbors (KNN) with Train/Test Split
## Time difference of 0.01145697 secs
##
## knn.predT7 0 1
## 0 253 244
## 1 240 260
## [1] 0.5145436
## [1] 0.4854564
## Time difference of 0.01071215 secs
##
## knn.predT3 0 1
## 0 245 261
## 1 248 243
## [1] 0.4894684
## [1] 0.5105316
## Time difference of 0.01069403 secs
##
## knn.predT5 0 1
## 0 246 250
## 1 247 254
## [1] 0.5015045
## [1] 0.4984955
## Time difference of 0.0113771 secs
##
## knn.predT10 0 1
## 0 257 243
## 1 236 261
## [1] 0.5195587
## [1] 0.4804413
Part 3: 5-Fold Cross-Validation
Logistic Regression with 5-Fold CV
## Time difference of 0.123796 secs
## [1] 0.5914787 0.5313283 0.5864662 0.5513784 0.5804020
## [1] 0.5682107
## [1] 0.01154756

## TRUTH
## OUTPUT 0 1
## 0 544 414
## 1 447 589
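
The three numeric outputs above are consistent with five per-fold accuracies, their mean, and the standard error of that mean (standard deviation divided by the square root of the number of folds). The sketch below shows one way such a 5-fold procedure could be written in base R. It is not the code used for this report: the helpers `cv_logistic` and `cv_summary`, the 0.5 cutoff, the seed, and the simulated `toy` data are all assumptions.

```r
# Hypothetical helper: k-fold CV accuracy for a logistic regression.
# `data` must contain a 0/1 response column whose name is given in `response`.
cv_logistic <- function(data, response, k = 5, seed = 1) {
  set.seed(seed)
  # Randomly assign each row to one of k roughly equal folds.
  folds <- sample(rep(seq_len(k), length.out = nrow(data)))
  form <- as.formula(paste(response, "~ ."))
  sapply(seq_len(k), function(i) {
    train <- data[folds != i, , drop = FALSE]
    test  <- data[folds == i, , drop = FALSE]
    fit   <- glm(form, data = train, family = binomial)
    prob  <- predict(fit, newdata = test, type = "response")
    pred  <- as.integer(prob > 0.5)   # assumed 0.5 classification cutoff
    mean(pred == test[[response]])    # accuracy on the held-out fold
  })
}

# Hypothetical helper: mean accuracy and its standard error across folds.
cv_summary <- function(acc) {
  c(mean = mean(acc), se = sd(acc) / sqrt(length(acc)))
}

# Simulated data (assumption, for illustration only).
set.seed(42)
n <- 200
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$y <- rbinom(n, 1, plogis(0.8 * toy$x1 - 0.5 * toy$x2))
toy_acc <- cv_logistic(toy, "y")

# Summary of the fold accuracies reported above for logistic regression.
reported_acc <- c(0.5914787, 0.5313283, 0.5864662, 0.5513784, 0.5804020)
reported_summary <- cv_summary(reported_acc)
```

Applied to the reported fold accuracies, `cv_summary` reproduces the mean and standard error printed above up to the rounding of the inputs.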
LDA with 5-Fold CV
## Time difference of 0.0616281 secs
## [1] 0.5939850 0.5313283 0.5864662 0.5538847 0.5804020
## [1] 0.5692132
## [1] 0.0116334

## TRUTH
## OUTPUT 0 1
## 1 541 409
## 2 450 594
QDA with 5-Fold CV
## Time difference of 0.05892515 secs
## [1] 0.5689223 0.5513784 0.5839599 0.5964912 0.5603015
## [1] 0.5722107
## [1] 0.00810621

## TRUTH
## OUTPUT 0 1
## 1 492 354
## 2 499 649
KNN with 5-Fold CV
## Time difference of 0.04586816 secs
## [1] 0.5087719 0.5187970 0.5213033 0.5413534 0.5025126
## [1] 0.5185476
## [1] 0.006634929

## TRUTH
## OUTPUT 0 1
## 1 531 500
## 2 460 503
## Time difference of 0.04767895 secs
## [1] 0.5288221 0.5363409 0.5187970 0.5463659 0.4899497
## [1] 0.5240551
## [1] 0.009649504

## TRUTH
## OUTPUT 0 1
## 1 547 505
## 2 444 498
## Time difference of 0.04560113 secs
## [1] 0.5413534 0.5288221 0.5288221 0.5238095 0.5025126
## [1] 0.5250639
## [1] 0.006339286

## TRUTH
## OUTPUT 0 1
## 1 547 503
## 2 444 500
## Time difference of 0.05186296 secs
## [1] 0.5488722 0.5162907 0.5037594 0.5187970 0.5050251
## [1] 0.5185489
## [1] 0.008143353

## TRUTH
## OUTPUT 0 1
## 1 542 511
## 2 449 492
Results Summary
Full Dataset Results
## LDA QDA Logistic Regression KNN k=3 KNN k=5 KNN k=7 KNN k=10
## Accuracy 0.588 0.592 0.588 0.763 0.708 0.671 0.657
## Sensitivity 0.603 0.681 0.603 0.757 0.692 0.649 0.646
## Specificity 0.572 0.503 0.572 0.770 0.725 0.693 0.668

Train/Test Split Results
## LDA QDA Logistic Regression KNN k=3 KNN k=5 KNN k=7 KNN k=10
## Accuracy 0.565 0.552 0.568 0.489 0.502 0.515 0.520
## Sensitivity 0.589 0.700 0.589 0.482 0.504 0.516 0.518
## Specificity 0.540 0.400 0.546 0.497 0.499 0.513 0.521

5-Fold Cross-Validation Results
## LDA QDA Logistic Regression KNN k=3 KNN k=5 KNN k=7 KNN k=10
## Accuracy 0.569 0.572 0.568 0.525 0.529 0.528 0.528
## Sensitivity 0.592 0.647 0.587 0.504 0.500 0.500 0.498
## Specificity 0.546 0.496 0.549 0.538 0.549 0.547 0.550

Comparative Boxplot of 5-Fold CV Results
